NSF PAR Search | NSF Public Access Repository


Two-Dimensional Longest Common Extension Queries in Compact Space

https://doi.org/10.4230/LIPICS.STACS.2025.38

Ganguly, Arnab; Gibney, Daniel; Shah, Rahul; Thankachan, Sharma V (January 2025, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Beyersdorff, Olaf; Pilipczuk, Michał; Pimentel, Elaine; Thắng, Nguyễn Kim (Ed.)
For a length n text over an alphabet of size σ, we can encode the suffix tree data structure in 𝒪(n log σ) bits of space. It supports suffix array (SA), inverse suffix array (ISA), and longest common extension (LCE) queries in 𝒪(log^ε_σ n) time, which enables efficient pattern matching; here ε > 0 is an arbitrarily small constant. Further improvements are possible for LCE queries, where 𝒪(1) time queries can be achieved using an index of space 𝒪(n log σ) bits. However, compactly indexing a two-dimensional text (i.e., an n × n matrix) has been a major open problem. We show progress in this direction by first presenting an 𝒪(n² log σ)-bit structure supporting LCE queries in near 𝒪((log_σ n)^{2/3}) time. We then present an 𝒪(n² log σ + n² log log n)-bit structure supporting ISA queries in near 𝒪(log n ⋅ (log_σ n)^{2/3}) time. Within a similar space, achieving SA queries in poly-logarithmic (even strongly sub-linear) time is a significant challenge. However, our 𝒪(n² log σ + n² log log n)-bit structure can support SA queries in 𝒪(n²/(σ log n)^c) time, where c is an arbitrarily large constant, which enables pattern matching in time faster than what is possible without preprocessing. We then design a repetition-aware data structure. The δ_2D compressibility measure for two-dimensional texts was recently introduced by Carfagna and Manzini [SPIRE 2023]. The measure ranges from 1 to n², with smaller δ_2D indicating a highly compressible two-dimensional text. The current data structure utilizing δ_2D allows only element access. We obtain the first structure based on δ_2D for LCE queries. It takes Õ(n^{5/3} + n^{8/5} δ_2D^{1/5}) space and answers queries in 𝒪(log n) time.
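For readers unfamiliar with the three query types named in this abstract, the following is a minimal sketch of what SA, ISA, and LCE return on an ordinary one-dimensional text. It is a naive baseline for illustration only, not the compact structure described in the paper; the function names and the 0-based indexing are assumptions made here.

```python
def suffix_array(text):
    # SA[r] = starting position of the r-th smallest suffix (0-based).
    # Naive construction by sorting suffixes; fine for small examples only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def inverse_suffix_array(sa):
    # ISA[i] = rank of the suffix starting at position i, so ISA[SA[r]] == r.
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa


def lce(text, i, j):
    # Length of the longest common prefix of the suffixes text[i:] and text[j:],
    # found by direct character comparison (linear time in the worst case).
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


text = "banana"
sa = suffix_array(text)
isa = inverse_suffix_array(sa)
print(sa, isa, lce(text, 1, 3))
```

The direct comparison in `lce` takes time proportional to the answer; the structures in the abstract aim to answer the same kind of query far faster while using space close to that of the text itself.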
Full Text Available
Photoacoustic image guidance for laser tonsil ablation: approach and initial results

https://doi.org/10.1117/12.3007295

Pacheco, Nicholas E; Gao, Shang; Cleary, Kevin; Shah, Rahul; Zhang, Haichong K; Fichera, Loris (March 2024, SPIE)
Rettmann, Maryam E; Siewerdsen, Jeffrey H (Ed.)
Tonsillectomy, one of the most common surgical procedures worldwide, is often associated with postoperative complications, particularly bleeding. Tonsil laser ablation has been proposed as a safer alternative; however, its adoption has been limited because it can be difficult for a surgeon to visually control the thermal interactions that occur between the laser and the tissue. In this study, we propose to monitor the ablation caused by a CO2 laser on ex-vivo tonsil tissue using photoacoustic imaging. Soft tissue’s unique photoacoustic spectra were used to distinguish between ablated and non-ablated tissue. Our results suggest that photoacoustic imaging is able to visualize necrosis formation and calculate the necrotic extent, offering the potential for improved tonsil laser ablation outcomes.
Full Text Available
The parameter space for scalable integration of atomically thin graphene with Nafion for proton exchange membrane (PEM) applications

https://doi.org/10.1039/d3ma00180f

Chaturvedi, Pavan; Moehring, Nicole K; Knight, Thomas; Shah, Rahul; Vlassiouk, Ivan; Kidambi, Piran R (August 2023, Materials Advances)

Incorporating atomically thin graphene into proton exchange membranes (PEMs) via scalable and facile processes presents the potential for advancing energy conversion and storage applications while mitigating persistent issues of undesired species crossover.
Full Text Available
Ranked Document Retrieval in External Memory

https://doi.org/10.1145/3559763

Shah, Rahul; Sheng, Cheng; Thankachan, Sharma; Vitter, Jeffrey (January 2023, ACM Transactions on Algorithms)

The ranked (or top-k) document retrieval problem is defined as follows: preprocess a collection {T₁, T₂, …, T_d} of d strings (called documents) of total length n into a data structure, such that for any given query (P, k), where P is a string (called pattern) of length p ≥ 1 and k ∈ [1, d] is an integer, the identifiers of those k documents that are most relevant to P can be reported, ideally in the sorted order of their relevance. The seminal work by Hon et al. [FOCS 2009 and Journal of the ACM 2014] presented an O(n)-space (in words) data structure with O(p + k log k) query time. The query time was later improved to O(p + k) [SODA 2012] and further to O(p/log_σ n + k) [SIAM Journal on Computing 2017] by Navarro and Nekrich, where σ is the alphabet size. We revisit this problem in the external memory model and present three data structures. The first one takes O(n) space and answers queries in O(p/B + log_B n + k/B + log^*(n/B)) I/Os, where B is the block size. The second one takes O(n log^*(n/B)) space and answers queries in optimal O(p/B + log_B n + k/B) I/Os. In both cases, the answers are reported in the unsorted order of relevance. To handle sorted top-k document retrieval, we present an O(n log(d/B)) space data structure with optimal query cost.
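The query in this abstract can be illustrated with a brute-force baseline. The sketch below assumes that relevance means term frequency (the number of occurrences of P in a document), which the abstract does not specify; the function names, the 0-based document identifiers, and the tie-breaking rule are likewise assumptions. It scans every document per query and is not one of the external-memory structures from the paper.

```python
import heapq


def count_occurrences(doc, pattern):
    # Count (possibly overlapping) occurrences of pattern in doc.
    count = 0
    start = doc.find(pattern)
    while start != -1:
        count += 1
        start = doc.find(pattern, start + 1)
    return count


def top_k(docs, pattern, k):
    # Score every document, drop those that do not contain the pattern,
    # then report up to k identifiers in decreasing order of score.
    # Ties are broken in favor of the smaller identifier (an assumption).
    scored = []
    for doc_id, doc in enumerate(docs):
        score = count_occurrences(doc, pattern)
        if score > 0:
            scored.append((score, doc_id))
    best = heapq.nlargest(k, scored, key=lambda t: (t[0], -t[1]))
    return [doc_id for _, doc_id in best]


docs = ["banana", "bandana", "cabana", "apple"]
print(top_k(docs, "an", 2))
```

This baseline reads the whole collection for each query. The data structures in the abstract avoid that, with query cost depending on p, k, and the block size B, plus a log_B n term.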
Full Text Available
Fully Functional Parameterized Suffix Trees in Compact Space

Ganguly, Arnab; Shah, Rahul; Thankachan, Sharma V. (June 2022, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022))
Mikolaj Bojanczyk; Emanuela Merelli; David P. Woodruff (Ed.)
Two equal length strings are a parameterized match (p-match) iff there exists a one-to-one function that renames the symbols in one string to those in the other. The Parameterized Suffix Tree (PST) [Baker, STOC '93] is a fundamental data structure that handles various string matching problems under this setting. The PST of a text T[1,n] over an alphabet Σ of size σ takes O(n log n) bits of space. It can report any entry in (parameterized) (i) suffix array, (ii) inverse suffix array, and (iii) longest common prefix (LCP) array in O(1) time. Given any pattern P as a query, a position i in T is an occurrence iff T[i,i+|P|-1] and P are a p-match. The PST can count the number of occurrences of P in T in time O(|P| log σ) and then report each occurrence in time proportional to that of accessing a suffix array entry. An important question is, can we obtain a compressed version of PST that takes space close to the text’s size of n log σ bits and still support all three functionalities mentioned earlier? In SODA '17, Ganguly et al. answered this question partially by presenting an O(n log σ) bit index that can support (parameterized) suffix array and inverse suffix array operations in O(log n) time. However, the compression of the (parameterized) LCP array and the possibility of faster suffix array and inverse suffix array queries in compact space were left open. In this work, we obtain a compact representation of the (parameterized) LCP array. With this result, in conjunction with three new (parameterized) suffix array representations, we obtain the first set of PST representations in o(n log n) bits (when log σ = o(log n)) as follows. Here ε > 0 is an arbitrarily small constant.
- Space O(n log σ) bits and query time O(log_σ^ε n);
- Space O(n log σ log log_σ n) bits and query time O(log log_σ n); and
- Space O(n log σ log^ε_σ n) bits and query time O(1).
The first trade-off is an improvement over Ganguly et al.’s result, whereas our third trade-off matches the optimal time performance of Baker’s PST while squeezing the space by a factor roughly log_σ n. We highlight that our trade-offs match the space-and-time bounds of the best-known compressed text indexes for exact pattern matching and further improvement is highly unlikely.
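The p-match definition at the start of this abstract can be checked directly. The sketch below is an illustration under the simplifying assumption that every symbol may be renamed, as in the abstract's definition; the function names are hypothetical. It also shows the well-known "previous occurrence" encoding, under which two such strings p-match exactly when their encodings are equal. It is not the compact PST from the paper.

```python
def is_p_match(s, t):
    # True iff some one-to-one renaming of symbols turns s into t.
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # setdefault records the first pairing and returns the stored one,
        # so any later conflicting pairing is detected in either direction.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def prev_encode(s):
    # Replace each symbol by the distance to its previous occurrence,
    # or 0 if it has not appeared before.
    last_seen = {}
    encoded = []
    for i, symbol in enumerate(s):
        encoded.append(i - last_seen[symbol] if symbol in last_seen else 0)
        last_seen[symbol] = i
    return encoded


def p_occurrences(text, pattern):
    # Brute-force list of 0-based positions i where text[i:i+len(pattern)]
    # and pattern are a p-match.
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if is_p_match(text[i:i + m], pattern)]


print(is_p_match("abab", "xyxy"), prev_encode("abab"), p_occurrences("abcabc", "xyz"))
```

The brute-force scan in `p_occurrences` checks every window of the text, which is the cost an index such as the PST is designed to avoid.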
Full Text Available
A Linear-Space Data Structure for Range-LCP Queries in Poly-Logarithmic Time

https://doi.org/10.1007/978-3-319-94776-1_51

Abedin, Paniz; Ganguly, Arnab; Nekrich, Yakov; Sadakane, Kunihiko; Shah, Rahul; Thankachan, Sharma V (January 2018, International Computing and Combinatorics Conference)

Full Text Available
The CosmoVerse White Paper: Addressing observational tensions in cosmology with systematics and fundamental physics

https://doi.org/10.1016/j.dark.2025.101965

Di Valentino, Eleonora; Said, Jackson Levi; Riess, Adam; Pollo, Agnieszka; Poulin, Vivian; Gómez-Valent, Adrià; Weltman, Amanda; Palmese, Antonella; Huang, Caroline D; Bruck, Carsten van de; et al (September 2025, Physics of the Dark Universe)

The standard model of cosmology has provided a good phenomenological description of a wide range of observations both at astrophysical and cosmological scales for several decades. This concordance model is constructed by a universal cosmological constant and supported by a matter sector described by the standard model of particle physics and a cold dark matter contribution, as well as very early-time inflationary physics, and underpinned by gravitation through general relativity. There have always been open questions about the soundness of the foundations of the standard model. However, recent years have shown that there may also be questions from the observational sector with the emergence of differences between certain cosmological probes. In this White Paper, we identify the key objectives that need to be addressed over the coming decade together with the core science projects that aim to meet these challenges. These discordances primarily rest on the divergence in the measurement of core cosmological parameters with varying levels of statistical confidence. These possible statistical tensions may be partially accounted for by systematics in various measurements or cosmological probes but there is also a growing indication of potential new physics beyond the standard model. After reviewing the principal probes used in the measurement of cosmological parameters, as well as potential systematics, we discuss the most promising array of potential new physics that may be observable in upcoming surveys. We also discuss the growing set of novel data analysis approaches that go beyond traditional methods to test physical models. These new methods will become increasingly important in the coming years as the volume of survey data continues to increase, and as the degeneracy between predictions of different physical models grows. 
There are several perspectives on the divergences between the values of cosmological parameters, such as the model-independent probes in the late Universe and model-dependent measurements in the early Universe, which we cover at length. The White Paper closes with a number of recommendations for the community to focus on for the upcoming decade of observational cosmology, statistical data analysis, and fundamental physics developments.
Free, publicly-accessible full text available September 1, 2026
